Abstract We describe the implementation details of
the colour reconnection model in the event generator 
HERWIG++. We study the impact on final-state observ- 
ables in detail and confirm the model idea from colour 
preconfinement on the basis of studies within the clus- 
ter hadronization model. Moreover, we show that the 
description of minimum bias and underlying event data 
at the LHC is improved with this model and present re- 
sults of a tune to available data. 

Keywords Monte Carlo • Hadron Collisions • 
Quantum Chromodynamics • Non-Perturbative
1 Introduction

High-energy hadronic collisions at the Large Hadron 
Collider (LHC) require a sound understanding of soft 
aspects of the collisions. All hard collisions are ac- 
companied by the underlying event (UE) which adds 
hadronic activity in all phase space regions. The physics 
of the underlying event is similar to the physics in min- 
imum bias (MB) interactions and very important to 
understand to quantify the impact of pile-up in high- 
luminosity runs at the LHC. A wide range of measurements
at the Tevatron and the LHC gives us a good picture
of MB interactions and the UE [1-13]. Data has
also shown that a good part of the underlying event 
is due to hard multiple partonic interactions (MPI). 
By now, the three major Monte Carlo event generators 
Herwig [14], Pythia [15,16] and Sherpa [17] have
an MPI model implemented to simulate the underlying
event. 

Such a model of independent multiple partonic interactions
was first implemented in Pythia [18], where
its relevance for a description of hadron collider data
was immediately shown. On a similar physics basis,
but with some differences in the detailed modelling,
the JIMMY add-on to the old Herwig program was
introduced. In these models, the average number
of additional hard scatters is calculated from a few in- 
put parameters and then for each hard event the ad- 
ditional number of hard scatters is sampled. The in- 
dividual scatters in turn are modelled similarly to the 
primary hard scatters from QCD 2 → 2 interactions
at leading order, with parton shower and hadronization 
applied as usual. The current underlying event model in 
Sherpa [17] is similar but will be replaced by a new approach
[20]. The current model in Pythia differs from
the original development in some details and follows
the idea of interleaved partonic interactions and showering.

In the recent releases of Herwig an MPI model is
also included [23]. It comes with two main parameters,
the minimum transverse momentum $p_\perp^{\min}$ of the additional
hard scatters and the parameter $\mu^2$, which can be
understood as the typical inverse proton radius squared 
and appears in the spatial transverse overlap of the in- 
coming hadrons. Good agreement with Tevatron data 
was found with this model. Soft interactions were added 
to this model in order to improve consistency with more 
general theoretical input as the total cross section and 
the elastic slope parameter in high-energy hadronic collisions
[24]. The distribution of transverse momenta in
the non-perturbative region below $p_\perp^{\min}$ was modelled
similarly to the proposal in [25]. Furthermore, it is assumed
that the soft partons are distributed differently
from the hard partons inside the hadron. The additional
parameters introduced here are fixed by requiring a de- 
scription of the total cross section and the slope pa- 
rameter, so we are still left with only two parameters. 
Once again, a good description of Tevatron data on the 
UE was found, now also where softer interactions play 
a role. The model for soft interactions smoothly extrap- 
olates from the perturbative into the non-perturbative 
region, similar to a model for intrinsic transverse momentum
in initial-state radiation [26].

With the advent of new data from the LHC at
900 GeV [3] we also considered new observables and
found distinct disagreement with data, e.g. in the pseudorapidity
of charged particles. It was clear that our
implementation was incomplete as we had not at all
tried to modify the relative colour structure of the multiple
hard scatters. In Fig. 1 we show the sensitivity to
the parameter $p_{\text{disrupt}}$, which controls the colour structure
of soft scatters, and see a partial refill of the central
rapidity plateau. This notable dependence on $p_{\text{disrupt}}$ of
soft scatters hints at the importance of colour correlations
in a more complete model. Furthermore, we studied
the dependence on other possible sources, e.g. on the
parton distribution functions (PDF), which are used
to extract the additional partons from the hadrons. In
Fig. 2 we show the pseudorapidity of charged particles
and the average transverse momentum as a function
of particle multiplicity, $\langle p_\perp \rangle (N_{\text{ch}})$, at that stage.
The lines represent different settings of the parameter
of soft colour disruption and two different PDF
sets: CTEQ6L1 [27] and MRST LO** [28]. We stress
that all settings gave a good description of the Tevatron
UE data. As discussed in more detail elsewhere,
even a dedicated tuning of the MPI model parameters
did not improve this description, which led us to include
a colour reconnection (CR) model in order to improve
the colour structure between various hard scatters
in the MPI model. The starting point is the idea
of colour preconfinement [32]. While in a single hard
interaction the colour structure is given by (the leading 
part of) the colour matrices that appear in the Feyn- 
man diagrams and also by the parton shower evolution, 
there is no such firm prescription for the assignment of 
colour lines or colour connections between individual 
hard scatters. Colour preconfinement leads us to the 
assumption that hard jets emerging from separate hard 
scatters should end up colour-connected when they are 
produced nearby in momentum space. As there is no 
such correlation in the non-perturbative modelling of 
the multiple hard interactions, we have to impose a 
model on it. Studies of such a model were carried out
earlier in [33-35]. In this paper we describe the details
of such a colour reconnection model and confirm this
physical picture with various analyses of the modelled
hadronic final state. Finally, we present results of tuning
this model to the currently available data on MB
interactions and the UE.

2 Modelling colour reconnections

The cluster hadronization model [36] is based on planar
diagram theory [37]: The dominant colour structure of
QCD diagrams in the perturbation expansion in $1/N_c$
can be represented in a planar form using colour lines,
which is commonly known as the $N_c \to \infty$ limit. The
resulting colour topology in Monte Carlo events with 
partons in the final state features open colour lines af- 
ter the parton showers. Following a non-perturbative 
isotropic decay of any remaining gluons in the parton jets to
light quark-antiquark pairs, the event finally consists of 
colour-connected partons in colour triplet or anti-triplet 
states. These parton pairs form colour-singlet clusters. 

In dijet production via $e^+e^-$ annihilation the invariant
mass spectrum of these clusters is independent of
the scale of the hard process [36,38]. The mass distribution
peaks at small values, $\mathcal{O}(1\,\text{GeV})$, and quickly falls
off at higher masses. Descriptively speaking, the cluster 
constituents tend to be close in momentum space. This 
property of perturbative QCD is referred to as colour 
preconfinement, as already stated above. The invari- 
ant cluster mass largely consists of the constituent rest 
masses, which gives rise to a pronounced peak at the 
parton rest mass threshold. Hence, clusters are inter- 
preted as highly excited pre-hadronic states. In the clus- 
ter hadronization model hadrons normally arise from 
non-perturbative, isotropic cluster decays. The Herwig
implementation of this hadronization model is described
in more detail in Ref. [14].

The situation in hadron collisions is necessarily 
more complicated. In a typical QCD 2 → 2 scatter,
there is QCD radiation from the initial-state parton 
shower accompanied by jets emerging from outgoing 
partons. Due to colour charge conservation, there are 
colour connections between the partonic subprocess and 
the two hadron remnants. As sketched in Fig. 3, the
primary hard subprocess is modelled in Herwig as an
interaction of two valence (anti)quarks [14]. Hence, in
PP (pp) collisions the hadron remnants are colour anti- 
triplets (triplets). The typical length scale of the valence 
parton extraction is the hadron size, $\mathcal{O}(1\,\text{fm})$, corresponding
to energies where perturbation theory is not
applicable. Thus, perturbative QCD cannot be used to 
calculate or assess the colour correlation between the 
partonic subprocess and the beam remnants. 

We face a similar situation if we consider multiple 
parton interactions in single hadron collisions. The MPI
model in Herwig equips the event with a number of
further QCD parton scatters, in addition to the primary
partonic subprocess. For each of these subprocesses
a pair of gluons, initiating the scatter, is extracted
from the colliding hadrons. The chosen colour
topology for this extraction corresponds to the $N_c \to \infty$
limit. As stated above, this limit is justified in perturbative
branchings. In non-perturbative regimes, however,
it is rather a QCD-motivated model than an assessable
approximation.

Fig. 1 Comparison of Herwig 2.4.2 (without CR) to ATLAS minimum-bias distributions at $\sqrt{s} = 0.9$ TeV with $N_{\text{ch}} > 2$,
$p_\perp > 500$ MeV and $|\eta| < 2.5$. The Herwig results are obtained by using three different values for $p_{\text{disrupt}}$: 0.0, 0.5 and 1.0.

Fig. 2 Dependence on the choice of the PDF set. The shown observables are the same as already introduced in Fig. 1. We
show results from two parameter points of the MPI model. For each point, two different PDF sets are selected, CTEQ6L1 and
MRST LO**. All settings give a satisfactory description of the Tevatron underlying-event data.

As can be seen in the sketch in Fig. 7 below,
the parton extraction model for the first and possi- 
ble additional partonic subprocesses introduces colour 
lines, which connect subprocesses to each other and 
to the hadron remnants. As a result, clusters emerge 
in hadronic collisions which link different parts of the 
hadron collision. Clearly, these clusters cannot be expected
to feature the same invariant-mass distribution
as the clusters in $e^+e^-$ dijet events do. Yet the cluster
hadronization model for hadronic collisions is adopted
unchanged. Colour reconnection intervenes at the stage
right before hadrons are generated from the clusters. It
provides the possibility to create clusters in a way which
does not strictly follow the actual colour topology: The
ends of the colour lines are reconnected, resulting in
a different cluster configuration. This rearrangement of
colour charges is pictorially shown in Fig. 4. Based on
the successful role of preconfinement in $e^+e^-$ collisions,
we designed two colour reconnection models to find
colour singlets with invariant masses smaller than
those given a priori. The colour reconnection models studied
in this paper differ in the underlying algorithm to find 
alternative cluster configurations. 
Fig. 4 Formation of clusters, which we represent by ovals here. Colour lines are dashed. The left
diagram shows colour-singlet clusters formed according to the dominating colour structure in the
$1/N_c$ expansion. The right diagram shows a possible colour-reconnected state: the partons of
the clusters A and B are arranged in new clusters, C and D.
Fig. 3 For the hard subprocess a valence quark v is extracted
from the proton. Since the valence quark parton distribution
functions dominate at large momentum fractions x and
small scales, the initial-state shower, which is generated
backwards starting from the partonic scatter, commonly terminates
on a valence quark. This situation is shown in the
leftmost figure. If the perturbative evolution still terminates 
on a sea (anti)quark or a gluon, as indicated in the other 
figures, one or two additional non-perturbative splittings are 
performed to force the evolution to end with a valence quark. 
The grey-shaded area indicates this non-perturbative region, 
whereas the perturbative parton shower happens in the region 
below. 

2.1 Plain colour reconnection 

A first model for colour reconnection has been imple- 
mented in Herwig as of version 2.5 [39]. We refer to it 
as the plain colour reconnection model (PCR) in this
paper. The following steps describe the full procedure: 

1. Create a list of all quarks in the event, in random 
order. Perform the subsequent steps exactly once for 
every quark in this list. 

2. The current quark is part of a cluster. Label this 
cluster A. 

3. Consider a colour reconnection with all other clusters
that exist at that time. Label the potential reconnection
partner B. For the possible new clusters
C and D, which would emerge when A and B are reconnected
(cf. Fig. 4), the following conditions must
be satisfied:

— The new clusters are lighter,

$$ m_C + m_D < m_A + m_B , \tag{1} $$

where $m_i$ denotes the invariant mass of cluster
$i$.

— C and D are no colour octets.

4. If at least one reconnection possibility could be
found in step 3, select the one which results in the
smallest sum of cluster masses, $m_C + m_D$. Accept
this colour reconnection with an adjustable probability
$p_{\text{reco}}$. In this case replace the clusters A and
B by the newly formed clusters C and D.

5. Continue with the next quark in step 2. 

The parameter $p_{\text{reco}}$ steers the amount of colour reconnection
in the PCR model. Because of the selection rule
in step 4, the PCR model tends to replace the heaviest
clusters by lighter ones. A priori the model is not guar- 
anteed to be generally valid because of the following 
reasons: The random ordering in the first step makes 
this algorithm non-deterministic since a different or- 
der of the initial clusters, generally speaking, leads to 
different reconnection possibilities being tested. More- 
over, apparently quarks and antiquarks are treated dif- 
ferently in the algorithm described above. 
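
The steps of the plain model can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Herwig code: clusters are represented as (quark, antiquark) pairs of four-momenta (E, px, py, pz), the function names `inv_mass` and `plain_reconnection` are hypothetical, and the colour-octet veto is reduced to a user-supplied predicate `is_octet` that by default vetoes nothing.

```python
import math
import random


def inv_mass(p1, p2):
    """Invariant mass of two four-momenta given as (E, px, py, pz)."""
    e, px, py, pz = (a + b for a, b in zip(p1, p2))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


def plain_reconnection(clusters, p_reco, rng=None,
                       is_octet=lambda q, qbar: False):
    """Sketch of the plain colour reconnection steps 1 to 5.

    clusters: list of (quark, antiquark) four-momentum pairs, changed in place.
    is_octet: hypothetical stand-in for the colour-octet veto.
    """
    rng = rng or random.Random()
    order = list(range(len(clusters)))
    rng.shuffle(order)                       # step 1: quarks in random order
    for a in order:                          # steps 2 and 5: one pass per quark
        q_a, qbar_a = clusters[a]
        m_a = inv_mass(q_a, qbar_a)
        best_b, best_sum = None, None
        for b, (q_b, qbar_b) in enumerate(clusters):   # step 3: all partners B
            if b == a:
                continue
            if is_octet(q_a, qbar_b) or is_octet(q_b, qbar_a):
                continue
            new_sum = inv_mass(q_a, qbar_b) + inv_mass(q_b, qbar_a)
            old_sum = m_a + inv_mass(q_b, qbar_b)
            # condition (1), keeping the lightest candidate found so far
            if new_sum < old_sum and (best_sum is None or new_sum < best_sum):
                best_b, best_sum = b, new_sum
        # step 4: accept the best candidate with probability p_reco
        if best_b is not None and rng.random() < p_reco:
            q_b, qbar_b = clusters[best_b]
            clusters[a] = (q_a, qbar_b)
            clusters[best_b] = (q_b, qbar_a)
    return clusters
```

Because the quark of cluster `a` stays in list slot `a` after a swap, iterating over the shuffled slot indices is equivalent to iterating over the shuffled list of quarks.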

2.2 Statistical colour reconnection 

The other colour reconnection implementation studied 
in this paper overcomes the conceptual drawbacks of 
the PCR model. We refer to this model as statistical
colour reconnection (SCR) throughout this work. In the 
first place, the algorithm aims at finding a cluster con- 
figuration with a preferably small colour length, defined 
as

$$ \lambda \equiv \sum_{i=1}^{N_{\text{cl}}} m_i^2 , \tag{2} $$

where $N_{\text{cl}}$ is the number of clusters in the event and $m_i$
is the invariant mass of cluster $i$. In the definition of the
colour length we opt for squared masses to give cluster 
configurations with similarly heavy clusters precedence 
over configurations with less equally distributed cluster 
masses. 
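
The preference for evenly distributed masses can be checked with a small numerical sketch. The masses below are made-up illustrative values in GeV, and `colour_length` is a hypothetical helper that simply sums the squared cluster masses.

```python
def colour_length(masses):
    """Colour length: sum of squared cluster masses."""
    return sum(m * m for m in masses)


# Two hypothetical configurations with the same sum of masses (6 GeV).
even = [3.0, 3.0]
uneven = [1.0, 5.0]

lam_even = colour_length(even)      # 9 + 9
lam_uneven = colour_length(uneven)  # 1 + 25
```

A criterion based on the plain sum of masses could not distinguish the two configurations, whereas the sum of squares is smaller for the evenly distributed one.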

Clearly, it is impossible to locate the global minimum
of $\lambda$, in general, since an event with 100 parton
pairs, for instance, implies about $100! \approx 10^{158}$ possible
cluster configurations to be tested. The Simulated Annealing
algorithm from Ref. [40], however, has proven
useful in solving optimisation problems like this approximately.
The SCR model is an application of this algorithm
with $\lambda$ as the objective function to be minimised.
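
The order of magnitude quoted for the number of configurations can be verified quickly. The sketch assumes that each of the 100 quarks may be paired with any of the 100 antiquarks, i.e. it ignores the colour-octet veto, so the count is simply 100 factorial.

```python
import math

n_pairs = 100

# log10(n!) via the log-gamma function, avoiding the huge integer
log10_configs = math.lgamma(n_pairs + 1) / math.log(10)

# cross-check with exact integer arithmetic
n_digits = len(str(math.factorial(n_pairs)))
```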

The SCR algorithm selects random pairs of clusters
and suggests them for colour reconnection. Just like in
the PCR model, clusters consisting of splitting products
of a colour-octet state are vetoed. A reconnection step
which reduces $\lambda$ is always accepted. If the reconnection
raises the colour length, it is accepted with probability

$$ p = \exp\left( - \frac{\lambda_2 - \lambda_1}{T} \right) , \tag{3} $$

where $\lambda_1$ and $\lambda_2$ denote the colour lengths before and
after the reconnection, respectively. This gives the system
the possibility to escape local minima in the colour
length. The "temperature" $T$ is a control parameter,
which is gradually reduced during the procedure. At
high temperatures, $T > \mathcal{O}(\lambda_2 - \lambda_1)$, the algorithm
is likely to accept steps which raise $\lambda$. By contrast,
lower temperatures imply a small probability for colour-length-increasing
reconnection steps.
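
The acceptance rule is the standard Metropolis criterion of simulated annealing and can be written as a short sketch. The function names are hypothetical, and the minus sign in the exponent is the usual convention that makes the probability smaller than one for steps raising the colour length.

```python
import math
import random


def acceptance_probability(lam_before, lam_after, temperature):
    """Probability to accept a reconnection step at a given temperature."""
    if lam_after <= lam_before:
        return 1.0                      # steps that do not raise lambda
    return math.exp(-(lam_after - lam_before) / temperature)


def accept_step(lam_before, lam_after, temperature, rng=random):
    """Draw the accept/reject decision for one proposed reconnection."""
    return rng.random() < acceptance_probability(lam_before, lam_after,
                                                 temperature)
```

For a fixed increase in the colour length, the probability approaches one at high temperature and becomes negligible at low temperature.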

The transition from high to low temperatures is
determined by the annealing schedule, which flexibly
adapts to the number of clusters, $N_{\text{cl}}$, and to the colour
length in the event. First, a starting temperature is determined
from the typical change in the colour length,
$\Delta\lambda = \lambda_2 - \lambda_1$. To this end, a few random dry-run colour
reconnections are performed, all starting with the default
cluster configuration. The initial temperature is
set to

$$ T_{\text{init}} = c \cdot \operatorname{median}\{ |\Delta\lambda_i| \} , \tag{4} $$

where $c$ is a free parameter of the model. Using the median
makes this definition less prone to outliers compared
to the mean. The algorithm proceeds in steps
with fixed temperature. At the end of each temperature
step $T$ decreases by a factor $f$, which is another
free model parameter, with $f \in (0, 1)$. Each value of $T$
is held constant for $\alpha N_{\text{cl}}$ reconnection attempts with
another free parameter $\alpha$. The algorithm stops as soon
as no successful colour reconnections happen in a temperature
step, but at most $N_{\text{steps}}$ temperature steps
are tested. We use the parameters $c$, $\alpha$, $f$ and $N_{\text{steps}}$,
which are all related to the annealing schedule, to tune
the SCR model to data.
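
The annealing schedule described above can be sketched as follows. This is a simplified illustration, not the Herwig implementation: clusters are (quark, antiquark) pairs of four-momenta (E, px, py, pz), the colour-octet veto is omitted, the number of dry runs `n_dry` is a hypothetical choice for "a few", and the number of attempts per temperature step is taken as the parameter alpha times the number of clusters.

```python
import math
import random
import statistics


def inv_mass2(p1, p2):
    """Squared invariant mass of two four-momenta (E, px, py, pz)."""
    e, px, py, pz = (a + b for a, b in zip(p1, p2))
    return max(e * e - px * px - py * py - pz * pz, 0.0)


def delta_lambda(clusters, i, j):
    """Change in colour length if clusters i and j swap their antiquarks."""
    (q_i, a_i), (q_j, a_j) = clusters[i], clusters[j]
    return (inv_mass2(q_i, a_j) + inv_mass2(q_j, a_i)
            - inv_mass2(q_i, a_i) - inv_mass2(q_j, a_j))


def statistical_reconnection(clusters, c, alpha, f, n_steps,
                             n_dry=10, rng=None):
    """Simulated-annealing sketch; clusters is modified in place."""
    rng = rng or random.Random()
    n_cl = len(clusters)
    if n_cl < 2:
        return clusters
    # Dry runs on the default configuration fix the starting temperature.
    dry = [abs(delta_lambda(clusters, *rng.sample(range(n_cl), 2)))
           for _ in range(n_dry)]
    temperature = c * statistics.median(dry)
    for _ in range(n_steps):                    # at most n_steps temperatures
        n_success = 0
        for _ in range(max(1, int(alpha * n_cl))):
            i, j = rng.sample(range(n_cl), 2)   # random pair of clusters
            d = delta_lambda(clusters, i, j)
            if d <= 0.0 or (temperature > 0.0
                            and rng.random() < math.exp(-d / temperature)):
                (q_i, a_i), (q_j, a_j) = clusters[i], clusters[j]
                clusters[i], clusters[j] = (q_i, a_j), (q_j, a_i)
                n_success += 1
        if n_success == 0:                      # nothing accepted: stop early
            break
        temperature *= f                        # cool down by the factor f
    return clusters
```

Only the change in the colour length of the two affected clusters is evaluated per attempt, which avoids recomputing the full sum over all clusters.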

We would like to stress that the annealing model is
used only as a numerical tool to minimize the colour
length introduced above and hence we give no physical
interpretation to the model parameters themselves. We
argue later that merely the idea of minimizing the
colour length is indeed meaningful and physical.



3 Characteristics of colour reconnection 

In this section we want to study hadronization-related 
quantities which allow us to understand colour reconnection
from an event generator-internal point of view.
Here, a set of typical values for $c$, $\alpha$, $f$ and $N_{\text{steps}}$ in the
SCR model, as well as for $p_{\text{reco}}$ in the PCR model, was
used, which was obtained from tunes to experimental
data, as described below in Sec. 4.



3.1 Colour length drop 

To quantify the effect of colour reconnection at gener- 
ator level, we define the colour length drop 



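$$ \Delta_{\text{if}} \equiv 1 - \frac{\lambda_{\text{final}}}{\lambda_{\text{init}}} , \tag{5} $$

where $\lambda_{\text{init}}$ and $\lambda_{\text{final}}$ denote the colour length in an
event before and after colour reconnection, respectively.
$\Delta_{\text{if}}$ approximately vanishes in events with $\lambda_{\text{init}} \approx \lambda_{\text{final}}$,
i.e. with no or only minor changes in the colour length $\lambda$
due to colour reconnection. The other extreme, $\Delta_{\text{if}} \approx 1$,
indicates a notable drop in $\lambda$.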
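
As a small illustration, the colour length drop can be computed from the cluster masses before and after reconnection. The helper name `colour_length_drop` and the mass values (in GeV) are hypothetical.

```python
def colour_length_drop(masses_init, masses_final):
    """Colour length drop: 1 - lambda_final / lambda_init."""
    lam_init = sum(m * m for m in masses_init)
    lam_final = sum(m * m for m in masses_final)
    return 1.0 - lam_final / lam_init


# Two heavy clusters replaced by two light ones: drop close to 1.
large_drop = colour_length_drop([10.0, 10.0], [1.0, 1.0])

# Unchanged configuration: drop of exactly 0.
no_drop = colour_length_drop([2.0, 3.0], [2.0, 3.0])
```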

The distribution of $\Delta_{\text{if}}$ for soft inclusive LHC events
at 7 TeV is shown in Fig. 5(a). The plain and the statistical
colour reconnection models result in similar distributions
with pronounced peaks at 0 and 1. Note that
Fig. 5 shows logarithmic plots, so the plateau in between
the peaks is really low. There is also a small fraction
of events with negative $\Delta_{\text{if}}$, though. The colour reconnection
procedure actually raises $\lambda$ in these events.
In the SCR algorithm, this can happen since $\lambda$-raising
steps are explicitly allowed with a certain probability,
cf. Eq. (3). However, the PCR algorithm might also
raise $\lambda$ since the reconnection condition,
Eq. (1), is formulated in terms of the first power of cluster
masses, whereas $\lambda$ is defined as the sum of squared
cluster masses. As these events are rare, we expect no
impact on physical observables.
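
A purely numerical example shows how the linear mass condition of the plain model can be satisfied while the sum of squared masses grows. The mass values are made up for illustration (in GeV) and are not taken from simulated events.

```python
def colour_length(masses):
    """Colour length: sum of squared cluster masses."""
    return sum(m * m for m in masses)


before = [5.0, 5.0]   # hypothetical masses of clusters A and B
after = [9.0, 0.5]    # hypothetical masses of clusters C and D

# Linear condition of the plain model: 9.5 < 10.
passes_linear_condition = sum(after) < sum(before)

# Colour length rises from 50 to 81.25, so the drop is negative.
drop = 1.0 - colour_length(after) / colour_length(before)
```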

With soft inclusive hadron-hadron generator settings
there are, generally speaking, two important
classes of events. One class consists of events where there
is no notable change in the sum of squared cluster
masses, $\lambda$. In another large fraction of events, however,
colour reconnection causes an extreme drop in $\lambda$. An obvious
interpretation for this drop is that the colour reconnection
procedure replaces disproportionally heavy
clusters by much lighter ones.
Fig. 5 Colour length drop in pp and $e^+e^-$ collisions. Figure (a) shows $\Delta_{\text{if}}$ using the PCR and the SCR models. The events
were generated with soft inclusive LHC generator settings at 7 TeV. In (b) we show the colour length drop within the SCR
model in LHC dijet production with a number of $p_\perp$ cuts, where the c.m. energy is also 7 TeV. (c) shows the drop in the
colour length (using the SCR model) with LEP generator setup running at 189 GeV. We compare dijet events to W boson
pair production with fully hadronic decays.




This shift in the cluster mass spectrum, which both 
models aim at by construction, can also be observed 
directly. Figure 6 shows the cluster mass distribution
before and after colour reconnection. As expected and 
also intended, both CR procedures cause the distribu- 
tion to be enhanced in the low-mass peak region and 
suppressed in its, potentially unphysical, high-mass tail. 

In Fig. 5(b) we show the colour length drop in hard
dijet events in pp collisions. We observe a notable decrease
of large colour length drops, $\Delta_{\text{if}} \approx 1$, with increasing
cut on the jet transverse momentum at parton
level. The reason for this decrease is that higher momentum
fractions are required for the hard dijet subprocess,
whereas in soft events the remaining momentum fraction
of the proton remnants is higher. Hence clusters
containing a proton remnant are less massive in hard
events, which implies less need for colour reconnection.

The distribution of the colour length drop in $e^+e^-$
annihilation events looks completely different, as shown
in Fig. 5(c). We find that colour reconnection has no
impact on the colour length in the bulk of dijet events.
We show only the $\Delta_{\text{if}}$ distribution from the SCR model
here, but the PCR model yields similar results. These
results confirm that due to colour preconfinement partons
nearby in momentum space in most cases are combined
to colour singlets already. In events with hadronic
W pair decays, however, hadrons emerge from two separate
colour singlets. If there is a phase space overlap
of the two parton jet pairs, the production of hadrons
is expected to be sensitive to colour reconnection. We
address this question later on in Sec. 4.1. Here we want
to remark that the fraction of WW events with non-vanishing
colour length drop is slightly higher than for
the dijet case. Nevertheless, the vast majority of WW
events is not affected by colour reconnection either.



3.2 Classification of clusters 




Fig. 7 Classification of colour clusters in a hadron collision
event, which, in this example, consists of the primary subprocess
(left) and one additional parton interaction. The grey-shaded
area denotes non-perturbative parts of the simulation.
The three clusters represent the cluster classes defined
in Sec. 3.2: n-type (blue), i-type (red) and h-type clusters
(orange).



These results generically raise the question which
mechanism in the hadron event generation is responsible
for these overly heavy clusters. To gain access to
this issue, we classify all clusters by their ancestors in
the event history. A sketch of the three types of clusters
is shown in Fig. 7.

— The first class are the clusters consisting of partons 
emitted perturbatively in the same partonic subpro- 
cess. We call them h-type (hard) clusters. 

— The second class of clusters are the subprocesses- 
interconnecting clusters, which combine par- 
tons generated perturbatively in different par- 
tonic subprocesses. They are labelled as i-type 
(interconnecting) clusters. 

— The remaining clusters, which can occur in hadron 
collision events, are composed of at least one par- 
ton created non-perturbatively, i.e. during the ex- 
traction of partons from the hadrons or in soft scat- 
ters. In what follows, these clusters are called n-type 
(non-perturbative) clusters. 
Fig. 8 Cluster fraction functions, defined in Eq. (6), for LHC
dijet events at 7 TeV.
First we use this classification to analyse hadron
collision events as they are immediately before colour
rearrangement. For that purpose, we define the cluster
fraction functions

$$ f_a(m_{\text{cut}}) \equiv \frac{N_a(m_{\text{cut}})}{\sum_{b=h,i,n} N_b(m_{\text{cut}})} , \tag{6} $$

where $N_a(m_{\text{cut}})$ is the number of $a$-type clusters ($a = h, i, n$)
with $m > m_{\text{cut}}$, counted in a sufficiently large
number of events.¹ For instance, $f_i(100\,\text{GeV}) = 0.15$
says 15 % of all clusters with a mass larger than
100 GeV are subprocess-interconnecting clusters. By
construction, $f_a(m_{\text{cut}})$ is a number between 0 and 1 for
every class $a$. Moreover, the cluster fraction functions
satisfy

$$ \sum_{a=h,i,n} f_a(m_{\text{cut}}) = 1 . $$

Figure 8 shows the cluster fraction functions for LHC
dijet events at $\sqrt{s} = 7$ TeV. The fraction of non-perturbative
clusters increases with $m_{\text{cut}}$ and exceeds
0.5 at $m_{\text{cut}} \approx 70$ GeV. So for an increasing threshold
$m_{\text{cut}}$ up to values well beyond physically reasonable
cluster masses of a few GeV, the contribution of n-type
clusters becomes more and more dominant.
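
The cluster fraction functions are straightforward to evaluate from a list of classified clusters. The following sketch assumes that each cluster is stored as a (type, mass) pair with type 'h', 'i' or 'n'; the function name and the sample values are hypothetical.

```python
from collections import Counter


def cluster_fractions(clusters, m_cut):
    """Fraction of h-, i- and n-type clusters among those with m > m_cut."""
    counts = Counter(kind for kind, mass in clusters if mass > m_cut)
    total = sum(counts.values())
    if total == 0:
        # undefined above the maximum cluster mass
        raise ValueError("no cluster with mass above m_cut")
    return {kind: counts[kind] / total for kind in "hin"}


# Made-up sample of classified clusters, masses in GeV.
sample = [("h", 1.0), ("h", 2.0), ("i", 3.0),
          ("n", 50.0), ("n", 120.0), ("i", 150.0)]

fractions_all = cluster_fractions(sample, 0.0)
fractions_heavy = cluster_fractions(sample, 100.0)
```

By construction the three fractions add up to one for every threshold below the maximum cluster mass.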

A bin-by-bin breakdown of the contributions of the
various cluster types to the total cluster mass distribution
is shown in Fig. 9. There are several things to learn
from those plots. First, non-perturbative n-type clusters
do not contribute as much to the peak region, say
below 6 GeV, as perturbative h-type and i-type clusters
do. In the high-mass tail, however, n-type clusters
clearly dominate, as already indicated by the cluster
fraction functions discussed above. Both their minor
contribution at low masses and their large contribution
at high masses do not change after colour reconnection.
In total, however, the mass distribution is more peaked
after colour reconnection and the high-mass tail is suppressed
by a factor larger than 10.

¹ Apparently, $f_a(m_{\text{cut}})$ is only well-defined for $m_{\text{cut}}$ less than
the maximum cluster mass. On this interval, the series $(f_{a,n})$,
with $n$ the number of events taken into account, converges
pointwise to the function $f_a$. This is a more formal definition
of the cluster fraction functions.

Fig. 9 Primary cluster mass spectrum in LHC dijet events at 7 TeV. Figure (a) compares the mass distribution in the
pre-colour-reconnection stage to the distribution after colour reconnection. The contributions of the three cluster classes are
stacked. The histograms in (b) merely differ from the ones in (a) in their binning. Legend: default clusters and clusters
after reconnection, each split into h-type, i-type and n-type clusters.




3.3 Resulting physics implications 

The characteristics of clusters that have been studied in 
this section clearly confirm the physical picture we have 
started out with. The colour reconnection model in fact 
reduces the invariant masses of clusters that are mostly 
of non-perturbative origin. These arise as an artefact of 
the way we colour-connect additional hard scatters in 
the MPI model with the rest of the event. 

At this non-perturbative level we have no handle on
the colour information from theory, hence we have modelled
it: first in a very naive way when we extract the
'first' parton from the proton, but only to arrive at a
more physical picture later, where we use colour preconfinement
as a guiding principle. We therefore conclude
that our ansatz to model colour reconnections in the 
way we have done it reproduces a meaningful physical 
picture. 



4 Tuning and comparison of the model results 
with data 

In this section we address the question of whether the
MPI model in Herwig, equipped with the new CR
model, can improve the description of the ATLAS MB
and UE data, see Fig. 2. To that end we need to find
values of the free parameters (tune parameters) of the MPI
model with CR that give the best possible
description of the experimental data. Since both CR
models can be regarded as an extension of the cluster
model [36], which is used for hadronization in Herwig,
the tune of Herwig with CR models may require a
simultaneous re-tuning of the hadronization model parameters
to a wide range of experimental data, primarily
from LEP (see Appendix D from Ref.). Therefore,
we start this section by examining whether the
description of LEP data is sensitive to CR parameters.

4.1 Validation against $e^+e^-$ LEP data

Already in Section 3 we have seen that the colour
structure of LEP final states is well-defined by the
perturbative parton shower evolution. Moreover, the
CR model does not change this structure significantly.
Therefore, although CR is an extension of hadronization,
we can expect that the default hadronization parameters
are still valid in combination with CR. This
was confirmed by comparing Herwig results with and
without CR against a wide range of experimental data
from LEP [41-49]. As an example we show a comparison
of Herwig without and with CR (using the main
tunes for both CR methods presented in this paper) to
two LEP observables in Fig. 10. The full set of plots,
showing that the LEP data description in Herwig with
and without CR is of the same quality, can be found on
the Herwig and MCplots web pages [50,51]. These results
allow us to factorize the tuning procedure: The
well-tested default Herwig tune for parton shower and
hadronization parameters is retained, and only the parameters
from the CR and MPI models are tuned to
hadron collider data. However, we have checked each
tune presented in this paper against LEP results.

Fig. 10 Comparison of Herwig without CR (red line) and with CR (using the main tunes for both CR methods presented
in this paper) to exemplary measurements from the DELPHI detector at LEP. Panels: rapidity w.r.t. sphericity axes, and
mean charged multiplicity.

In addition to the analyses used for the hadronization
tuning, there are LEP analyses dedicated to colour
reconnection in $W^+W^- \to (q\bar{q})(q\bar{q})$ events [52-55],
originally proposed in Ref. [56]. In those analyses the
W bosons are reconstructed via kinematic cuts on all
possible jet pairs in four-jet events. The particle flow
between jets originating from different bosons was expected
to be enhanced in Monte Carlo models including
colour reconnection. However, only moderate sensitivity
to the tested CR models could be found at the time.
We have confirmed this with our colour reconnection
implementations. In Fig. 11 we show the sensitivity of
the particle flow between the identified jets to the reconnection
strength in the PCR model, compared to
DELPHI data from Ref. [52]. We observe a slight improvement
in the description of the data. A number of
apparent outliers in the experimental data, however, indicate
possibly too optimistic systematic errors in the
experimental analysis. For that reason, no clear constraints
on the model can be deduced from the data.

As the W bosons are produced on shell and significantly
boosted at $\sqrt{s} = 189$ GeV, the finite W width
can cause the two W bosons to travel long distances
before decaying. In the limit of a very small W width,
large reconnection effects between the two W systems
should thus be suppressed in the model. The moderate
sensitivity of the particle flow to colour reconnections
implies, however, that colour reconnection effects are
small in WW events. Note that the largely vanishing
colour length drop in WW events, cf. Fig. 5(c)
and the discussion in Sec. 3.1, also supports this conclusion.
Hence we retain the described generic reconnection
models also for WW events and do not introduce
an extra suppression mechanism.



[Fig. 11 legend: DELPHI data; no CR; CR envelope; PCR with $p_{\text{reco}} = 0.54$]

Fig. 11 Charged-particle flow in hadronic WW events at LEP with
$\sqrt{s} = 189$ GeV. The grey band indicates the range which is covered
by varying the colour reconnection strength $p_{\text{reco}}$ in the PCR
model. The definition of the rescaled angle, $\phi_{\text{rescaled}}$,
along with a detailed description of the analysis can be found in
Ref. [52].



4.2 Tuning to data from hadron colliders 

Now that we have validated the CR models by comparison against LEP data,
we are ready to tune their parameters to data provided by hadron
colliders. Before LHC data was available, the MPI model in Herwig [24]
was tuned by subdividing the two-dimensional parameter space, spanned by
the model's main parameters, the inverse proton radius squared $\mu^2$
and the minimum transverse momentum $p_\perp^{\min}$, into a grid. For
each of the parameter points on this grid, the total $\chi^2$ against the
Tevatron underlying-event data [1,57] was calculated. A region in the
parameter plane was found where similarly good values for the overall
$\chi^2$ could be obtained.

While tuning the MPI models including colour reconnection we are dealing
with a larger number $N$ of tunable parameters $p_i$, where $N = 4$ in
case of the PCR ($p_{\text{disrupt}}$, $p_{\text{reco}}$,
$p_\perp^{\min}$ and $\mu^2$) and $N = 7$ in case of the SCR model
($p_{\text{disrupt}}$, $p_\perp^{\min}$, $\mu^2$, $\alpha$, $c$, $f$ and
$N_{\text{steps}}$). Hence the simple tuning strategy from above is
ineffective. A comprehensive scan of 7 parameters, with 10 divisions in
each parameter, would require too much CPU time.
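To make the cost argument concrete, the following minimal sketch counts the generator runs of a full grid scan and compares them with the number of coefficients of a second-order polynomial in $N$ parameters, which is the smallest number of sampled points a quadratic parametrization can be fitted to. The function names are illustrative and are not part of Herwig or Professor.

```python
from math import comb


def grid_points(n_params: int, divisions: int) -> int:
    """Number of generator runs needed for a full grid scan."""
    return divisions ** n_params


def poly_coefficients(n_params: int, order: int = 2) -> int:
    """Number of coefficients of a polynomial of the given order in n_params variables."""
    return comb(n_params + order, order)


if __name__ == "__main__":
    for model, n_params in (("PCR", 4), ("SCR", 7)):
        print(model,
              "grid runs:", grid_points(n_params, 10),
              "quadratic coefficients:", poly_coefficients(n_params))
```

For the seven SCR parameters the grid needs $10^7$ runs, whereas a quadratic parametrization has 36 coefficients; in practice one would oversample this minimum.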

Instead, we use a parametrization-based tune method which is much more
efficient for our case. The starting point for this tuning procedure is
the selection of a range $[p_i^{\min}, p_i^{\max}]$ for each of the $N$
tuning parameters $p_i$. Event samples are generated for random points of
this $N$-dimensional hypercube in the parameter space. The number of
different points depends on the number of input parameters to ensure a
well converging behaviour of the final tune. Each generated event is
directly handed over to the Rivet package [58] for analysis. This allows
the computation of observables for each parameter point, which constitute
the input for the tuning process. The obtained distributions of
observables for each parameter variation are the starting point for the
main part of the tune, which is achieved using the Professor framework.
Professor parametrizes the generator response to the probed parameter
points. In that way it finds the set of parameters which fits the
selected observables best. The user is able to affect the tuning by
applying a weight for each observable, which specifies the impact of the
observable on the tuning process.
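The procedure described above can be illustrated with a minimal sketch. It is not the Professor API: the function names, the two-parameter toy generator that stands in for a Monte Carlo run, and the dense-grid minimization are all assumptions made for illustration. The sketch samples random points in the parameter hypercube, fits a second-order polynomial to each observable bin, and minimizes a weighted $\chi^2$ of the parametrized response against reference data.

```python
import numpy as np


def quadratic_features(points):
    """Design matrix with constant, linear and second-order terms."""
    points = np.atleast_2d(np.asarray(points, dtype=float))
    n_points, n_params = points.shape
    columns = [np.ones(n_points)]
    columns += [points[:, i] for i in range(n_params)]
    columns += [points[:, i] * points[:, j]
                for i in range(n_params) for j in range(i, n_params)]
    return np.column_stack(columns)


def fit_response(points, responses):
    """Least-squares fit of one quadratic surrogate per observable bin."""
    coeffs, *_ = np.linalg.lstsq(quadratic_features(points), responses, rcond=None)
    return coeffs  # shape: (n_coefficients, n_bins)


def weighted_chi2(candidates, coeffs, data, errors, weights):
    """Weighted chi^2 of the surrogate prediction for each candidate point."""
    prediction = quadratic_features(candidates) @ coeffs
    return np.sum(weights * ((prediction - data) / errors) ** 2, axis=1)


def toy_generator(p):
    """Hypothetical stand-in for a generator run: two bins, two parameters."""
    return np.array([1.0 + 2.0 * p[0] + p[1] ** 2, 0.5 * p[0] * p[1] - p[1]])


def tune(ranges, generator, data, errors, weights, n_samples=50, seed=1):
    """Sample the hypercube, parametrize the response, minimize chi^2."""
    rng = np.random.default_rng(seed)
    low, high = np.asarray(ranges, dtype=float).T
    # 1. Random anchor points inside the parameter hypercube.
    points = rng.uniform(low, high, size=(n_samples, low.size))
    # 2. "Run the generator" at every anchor point.
    responses = np.array([generator(p) for p in points])
    # 3. Parametrize the response bin by bin.
    coeffs = fit_response(points, responses)
    # 4. Minimize the surrogate chi^2 on a dense grid (a crude minimizer).
    axes = [np.linspace(a, b, 201) for a, b in zip(low, high)]
    candidates = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1)
    candidates = candidates.reshape(-1, low.size)
    chi2 = weighted_chi2(candidates, coeffs, data, errors, weights)
    return candidates[np.argmin(chi2)]


def demo():
    """Recover known toy parameters from pseudo-data."""
    truth = np.array([0.3, 0.7])
    data = toy_generator(truth)
    errors = np.array([0.05, 0.05])
    weights = np.array([1.0, 1.0])  # equal weights for both bins
    return tune([(0.0, 1.0), (0.0, 1.0)], toy_generator, data, errors, weights)


if __name__ == "__main__":
    print(demo())
```

Because the surrogate is evaluated instead of the generator, the minimization step costs almost nothing compared with producing the anchor runs, which is the efficiency gain referred to in the text.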



4.2.1 Tuning to minimum-bias data

As we initially were primarily aiming at an improved description of MB
data, we started by tuning the PCR model to ATLAS MB data. Since
currently there is no model for soft diffractive physics in Herwig, we
use the diffraction-reduced ATLAS MB measurement with an additional cut
on the number of charged particles, $N_{\text{ch}} > 6$. The observables
we used for the tune are the pseudorapidity distribution of the charged
particles, the charged multiplicity, the charged-particle transverse
momentum spectrum and the average transverse momentum measured as a
function of the number of charged particles. All four available MB
observables entered the tune with equal weights. The results of this tune
are shown by the blue lines in Fig. 12. The bottom right figure shows
that colour reconnection helps to achieve a better description of
$\langle p_\perp \rangle (N_{\text{ch}})$. Also the other three
distributions are now well described. We conclude that the CR model was
the missing piece of the MPI model in Herwig++. We clearly improve the
description of the pseudorapidity distribution.



4.2.2 Tuning to underlying-event data

The next important question was whether the new model is able to describe
the UE data collected by ATLAS at 7 TeV [i]. The measurements are made
relative to a leading object (the hardest charged track in this case).
Then, the transverse plane is subdivided in azimuthal angle $\phi$
relative to this leading object at $\phi = 0$. The region around the
leading object, $|\phi| < \pi/3$, is called the "towards" region. The
opposite region, where we usually find a recoiling hard jet,
$|\phi| > 2\pi/3$, is called the "away" region, while the remaining
region, transverse to the leading object and its recoil, where the
underlying event is expected to be least 'contaminated' by activity from
the hard subprocess, is called the "transverse" region. Again, we only
focus on the tuning of the PCR model here. For the underlying-event tune
two observables were used: the mean number of stable charged particles
per unit of $\eta$-$\phi$,
$\langle d^2 N_{\text{ch}} / d\eta\, d\phi \rangle$, and the mean scalar
$p_\perp$ sum of stable particles per unit of $\eta$-$\phi$,
$\langle d^2 \sum p_\perp / d\eta\, d\phi \rangle$, both as a function of
$p_\perp^{\text{lead}}$, with charged particles in the kinematic range
$p_\perp > 500$ MeV and $|\eta| < 2.5$.
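The region definitions above can be written down compactly. The following is a minimal sketch, not the code of the ATLAS or Rivet analysis: the function names and the track format `(pt, eta, phi)` are assumptions, and the leading track is counted in the towards region here, which a real analysis may handle differently. Each region spans $2\pi/3$ in $\phi$ (the transverse region consists of two wedges of $\pi/3$), so with $|\eta| < 2.5$ its $\eta$-$\phi$ area is $10\pi/3$.

```python
import math


def ue_region(phi, phi_lead=0.0):
    """Return "towards", "transverse" or "away" for an azimuthal angle phi."""
    # Wrap the angular difference into [-pi, pi].
    dphi = math.atan2(math.sin(phi - phi_lead), math.cos(phi - phi_lead))
    if abs(dphi) < math.pi / 3:
        return "towards"
    if abs(dphi) > 2 * math.pi / 3:
        return "away"
    return "transverse"


def region_area(eta_max=2.5):
    """eta-phi area of one region: 2*eta_max in eta times 2*pi/3 in phi."""
    return 2.0 * eta_max * 2.0 * math.pi / 3.0


def event_densities(tracks, pt_min=0.5, eta_max=2.5):
    """Per-event N_ch and scalar pT sum per unit eta-phi in each region.

    tracks: list of (pt [GeV], eta, phi) tuples of stable charged particles.
    Returns (pt of the leading track, {region: [n_density, sum_pt_density]}),
    or None if no track passes the cuts.
    """
    accepted = [t for t in tracks if t[0] > pt_min and abs(t[1]) < eta_max]
    if not accepted:
        return None
    lead = max(accepted, key=lambda t: t[0])
    area = region_area(eta_max)
    densities = {r: [0.0, 0.0] for r in ("towards", "transverse", "away")}
    for pt, _eta, phi in accepted:
        region = ue_region(phi, lead[2])
        densities[region][0] += 1.0 / area
        densities[region][1] += pt / area
    return lead[0], densities
```

Averaging these per-event densities in bins of the leading-track $p_\perp$ gives profiles of the kind used in the tune.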

The resulting tune, named UE7-2, gives very satisfactory results not only
for the tuned observables but also for all other observables provided by
ATLAS in Ref. [i]. In Figs. 13(c), 14(c) and 15(c), we show
$\langle d^2 N_{\text{ch}} / d\eta\, d\phi \rangle$ and
$\langle d^2 \sum p_\perp / d\eta\, d\phi \rangle$ as a function of
$p_\perp^{\text{lead}}$ for $p_\perp > 500$ MeV in the "transverse",
"away" and "toward" regions, compared to the Herwig++ UE7-2 results
(green line).

We repeated the tuning process for the UE data collected by ATLAS at
900 GeV and CDF at 1800 GeV, and obtained as good results as for 7 TeV
(not shown in Figs. 13-15 for the sake of simplicity). It is worth
mentioning that the ATLAS UE observables with the lower $p_\perp$ cut on
the charged particles, $p_\perp > 100$ MeV, were not available during the
preparation of the UE7-2 tune but are also well described by the tune,
see Fig. 16(c). These results can therefore be considered as a prediction
of the model.

[Fig. 12 panels: charged particle $\eta$ at 900 GeV, track $p_\perp > 500$ MeV, for $N_{\text{ch}} > 6$; charged multiplicity $> 6$ at 900 GeV, track $p_\perp > 500$ MeV; charged particle $p_\perp$ at 900 GeV, track $p_\perp > 500$ MeV, for $N_{\text{ch}} > 6$; charged $\langle p_\perp \rangle$ vs. $N_{\text{ch}}$ at 900 GeV, track $p_\perp > 500$ MeV, for $N_{\text{ch}} > 1$]

Fig. 12 Comparison of Herwig 2.4.2 without CR and Herwig 2.5 with PCR to
ATLAS minimum-bias distributions at $\sqrt{s} = 0.9$ TeV with
$N_{\text{ch}} > 6$, $p_\perp > 500$ MeV and $|\eta| < 2.5$. The ATLAS
data was published in Ref. [H].



Figure 17 shows the angular distributions of the charged-particle
multiplicity and $\sum p_\perp$, with respect to the leading charged
particle (at $\phi = 0$). The data sets are shown for four different cut
values in the transverse momentum of the leading charged particle,
$p_\perp^{\text{lead}}$. With increasing cut on $p_\perp^{\text{lead}}$,
the development of a jet-like structure can be observed. The overall
description of the data is satisfactory, but we can also see that the
description improves as the lower cut value in $p_\perp^{\text{lead}}$
increases, as then the description is more driven by perturbation theory.
The full comparison with all ATLAS UE and MB data sets is available on
the Herwig tune page [50]. At this stage different UE tunes were
mandatory for different hadronic centre-of-mass energies $\sqrt{s}$. In
the next section we address the question of whether an energy-independent
UE tune can be obtained using the present model.

4.2.3 Centre-of-mass energy dependence of UE tunes 

To study the energy dependence of the parameters properly, we examine a
set of observables at different collider energies, whose description is
sensitive to the MPI model parameters. The experimental data should be
measured at all energies in similar phase-space regions and under not too
different trigger conditions. These conditions were met by two UE
observables: $\langle d^2 N_{\text{ch}} / d\eta\, d\phi \rangle$ and
$\langle d^2 \sum p_\perp / d\eta\, d\phi \rangle$, both measured as a
function of $p_\perp^{\text{lead}}$ (with $p_\perp^{\text{lead}} < 20$
GeV) by ATLAS at 900 GeV and 7000 GeV (with $p_\perp > 500$ MeV) and by
CDF at 1800 GeV. Let us first focus on the PCR model. In this case we
have four free model parameters, $p_{\text{disrupt}}$,
$p_{\text{reco}}$, $p_\perp^{\min}$ and $\mu^2$. For each hadronic
centre-of-mass energy we performed independent four-dimensional tunings.
Note that $p_\perp^{\text{lead}}$ denotes the transverse momentum of the
hardest track in the case of ATLAS, whereas the CDF underlying-event
analysis uses the $p_\perp$ of the leading jet, which we also call
$p_\perp^{\text{lead}}$ here.



Table 1 Tune values for $p_\perp^{\min}$. All other model parameters,
which do not depend on the c.m. energy, are summarized in Tab. 2.



Figure 18 shows the spread of the tuning results for each parameter
against Professor's heuristic $\chi^2$. In the first row we present
results for 900 GeV and in the second row for 7 TeV. Each point is from a
separate tune, made using various combinations of generator runs at
different points in the parameter space. We see that the parameters are
not well constrained and are sensitive to the input Monte Carlo (MC)
runs. This is due to what we have already seen during the tuning of the
MPI model without CR [23,24,60] to Tevatron data, namely the strong and
constant correlation between $p_\perp^{\min}$ and $\mu^2$. This
correlation reflects the fact that a smaller hadron radius always
balances against a larger $p_\perp$ cutoff, as far as the
underlying-event activity is concerned. With one of these two parameters
fixed, the remaining parameters are much less sensitive to the input MC
runs.

The most important information we can see in these figures is that the
experimental data for the two different c.m. energies (900 GeV and 7 TeV)
cannot be described by the same set of model parameters. More precisely,
the experimental data prefers different $p_\perp^{\min}$ values for
different hadronic centre-of-mass energies, while the rest of the
parameters may perhaps remain independent of the energy. This observation
led us to the creation of energy-extrapolated UE tunes, named UE-EE-3, in
which all parameters are fixed except for $p_\perp^{\min}$, which varies
with energy. We summarize the tune values for $p_\perp^{\min}$ at
different energies in Tab. 1. The other model parameters, which do not
depend on the c.m. energy, are given in Table 2.

Since by construction the MPI model depends on the PDF set, we created
two separate energy-extrapolated tunes for the CTEQ6L1 and MRST LO**
PDFs. In general, both tunes yield similar and satisfactory descriptions
of experimental data.¹ As an example see Fig. 16, in which we compare the
UE-EE-3 and UE-EE-3-CTEQ6L1 tunes to ATLAS UE observables, measured in
all three regions (toward, transverse and away).

[Table 1, surviving entries: 1.55, 2.26, 2.75; UE-EE-3-CTEQ6L1: 1.86; 1.58, 2.14]

We repeated this procedure also for the SCR model. However, since in this
case the tuning procedure was more complicated, as explained below, we
concentrated on one PDF set only, namely CTEQ6L1. The first obvious
complication was the larger number of parameters to tune. The second
complication was associated with the fact that one of the tuning
parameters, $N_{\text{steps}}$, is an integer number. The current version
of Professor, however, does not provide such an option; instead it treats
all parameters as real numbers. Therefore, we decided to carry out fifty
separate tunes for different fixed values of $N_{\text{steps}}$, from 1
to 50. The last problem that we encountered, which is probably associated
with the two previously mentioned problems, was that for some parameter
values the predictions from Professor were significantly different from
the results we received directly from Herwig++ runs. Initially, we
increased the order of the interpolating polynomials from second to
fourth, which should improve Professor's predictions, but this did not
improve the situation. Therefore, we first identified regions of the
parameter space where this problem appeared most frequently and then
excluded these from the tuning procedure. As a result, we obtained an
energy-extrapolated underlying-event tune for the SCR model, which we
call UE-EE-SCR-CTEQ6L1.
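The workaround for the integer-valued parameter can be sketched as an outer loop over fixed integer values with a continuous tune inside. In the sketch below the goodness-of-fit function is a hypothetical toy with one continuous parameter, not the Herwig++ response, and the grid search stands in for whatever minimizer the tuning tool applies.

```python
import numpy as np


def toy_chi2(x, n_steps):
    """Hypothetical goodness of fit for continuous x and integer n_steps."""
    return (x - 0.02 * n_steps) ** 2 + 0.001 * (n_steps - 10) ** 2


def tune_continuous(n_steps, grid=None):
    """Minimize over the continuous parameter for one fixed integer value."""
    if grid is None:
        grid = np.linspace(0.0, 1.0, 1001)
    values = toy_chi2(grid, n_steps)
    best = int(np.argmin(values))
    return float(grid[best]), float(values[best])


def scan_integer_parameter(n_values=range(1, 51)):
    """Run one separate tune per integer value and keep the best one."""
    results = {n: tune_continuous(n) for n in n_values}
    best_n = min(results, key=lambda n: results[n][1])
    best_x, best_chi2 = results[best_n]
    return best_n, best_x, best_chi2


if __name__ == "__main__":
    print(scan_integer_parameter())
```

The price of this approach is that the number of tunes grows linearly with the number of integer values scanned, here fifty.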



¹ The only difference is that the CTEQ6L1 gives more flexibility in the
choice of the model parameters.



In Figures 13, 14 and 15 we show a comparison of the PCR and SCR
energy-extrapolated (CTEQ6L1) tunes and the UE7-2 tune against
$\langle d^2 N_{\text{ch}} / d\eta\, d\phi \rangle$ and
$\langle d^2 \sum p_\perp / d\eta\, d\phi \rangle$ as a function of
$p_\perp^{\text{lead}}$ for $p_\perp > 500$ MeV in all three regions
(toward, transverse and away) and at three different collider energies.
We can see that the quality of the data description is high and at the
same level for all tunes. Nevertheless, we favour the SCR model as here
we have a clearer physics picture and a more flexible model.

Table 2 Parameters of the energy-extrapolating underlying-event tunes.
The last two parameters describe the running of $p_\perp^{\min}$
according to Eq. (7).

Fig. 13 ATLAS data at 900 GeV (1st column), CDF data at 1800 GeV (2nd
column) and ATLAS data at 7 TeV (3rd column), showing the multiplicity
density and $\sum p_\perp$ of the charged particles in the "transverse"
area as a function of $p_\perp^{\text{lead}}$. The data is compared to
the UE7-2, UE-EE-3-CTEQ6L1 and UE-EE-SCR-CTEQ6L1 tunes.

Fig. 16 ATLAS UE data at 7 TeV for the lower $p_\perp$ cut
($p_\perp > 100$ MeV) for the transverse (1st column), towards (2nd
column) and away (3rd column) areas, showing the multiplicity density and
$\sum p_\perp$ of the charged particles as a function of
$p_\perp^{\text{lead}}$. The data is compared to the UE7-2, UE-EE-3 and
UE-EE-3-CTEQ6L1 tunes.

Fig. 17 Azimuthal distribution of the charged particle multiplicity (left
panel) and $\sum p_\perp$ densities (right panel), with respect to the
direction of the leading charged particle (at $\phi = 0$), for
$|\eta| < 2.5$. The densities are shown for $p_\perp^{\text{lead}} > 1$
GeV, $p_\perp^{\text{lead}} > 2$ GeV, $p_\perp^{\text{lead}} > 3$ GeV and
$p_\perp^{\text{lead}} > 5$ GeV. The data is compared to the UE7-2 tune.

In the last step, we parametrized the $p_\perp^{\min}$ dependence. In a
first attempt we have chosen a logarithmic function to extrapolate
$p_\perp^{\min}$ to energies different from the tune energies. Therefore
we fitted a function of the form
$p_\perp^{\min}(s) = A \log(\sqrt{s}/B)$, where $A$ and $B$ are free fit
parameters, to the three $p_\perp^{\min}$ values obtained in the UE-EE-3
tune. The fit is shown in Fig. 19. Based on this, we provide UE tunes for
c.m. energies the LHC was or will be operating at. Since the logarithmic
form is not very stable for lower energies, we have replaced this ansatz
with a power law, see also e.g. [61],

$$p_\perp^{\min}(s) = p_{\perp,0}^{\min} \left( \frac{\sqrt{s}}{E_0} \right)^{b}. \qquad (7)$$

This is the default parametrization of the energy dependence from
Herwig++ release 2.6 [62]. The default value of $E_0$ is 7 TeV. For the
collider energies under consideration in our tunes there are no
significant differences in any of the observables due to this change. The
values for $b$ and $p_{\perp,0}^{\min}$, which we find by fitting Eq. (7)
to the $p_\perp^{\min}$ values from Tab. 1, are summarized in the last
two rows of Table 2.
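A fit of the power law in Eq. (7) to three tune points can be sketched as a straight-line fit in log-log space, since $\log p_\perp^{\min} = \log p_{\perp,0}^{\min} + b \log(\sqrt{s}/E_0)$. The numerical values of $p_{\perp,0}^{\min}$, $b$, $A$ and $B$ used below are invented for illustration and are not the tune values of Tables 1 and 2; the function names are likewise assumptions.

```python
import numpy as np


def pt_min_power(sqrt_s, pt0, b, e0=7000.0):
    """Power-law ansatz of Eq. (7); all energies in GeV."""
    return pt0 * (np.asarray(sqrt_s, dtype=float) / e0) ** b


def pt_min_log(sqrt_s, a, b_scale):
    """Logarithmic ansatz A*log(sqrt_s/B); vanishes at sqrt_s = B and is negative below."""
    return a * np.log(np.asarray(sqrt_s, dtype=float) / b_scale)


def fit_power_law(sqrt_s, pt_values, e0=7000.0):
    """Return (pt0, b) from a linear least-squares fit in log-log space."""
    x = np.log(np.asarray(sqrt_s, dtype=float) / e0)
    y = np.log(np.asarray(pt_values, dtype=float))
    slope, intercept = np.polyfit(x, y, 1)
    return float(np.exp(intercept)), float(slope)


if __name__ == "__main__":
    energies = np.array([900.0, 1800.0, 7000.0])       # tune energies in GeV
    pseudo_values = pt_min_power(energies, 3.0, 0.25)   # invented pt0 and b
    print(fit_power_law(energies, pseudo_values))
    # The logarithmic form changes sign below sqrt_s = B (invented A, B):
    print(pt_min_log(100.0, 1.0, 200.0), pt_min_power(100.0, 3.0, 0.25))
```

The power law stays positive at all energies, whereas the logarithmic form crosses zero at $\sqrt{s} = B$, which is one way to see why it is less suitable for extrapolation to low energies.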



In the future, we plan to study the energy scaling of the model
parameters using diffraction-reduced minimum-bias data, and then, in more
detail, the possibility of achieving a common description of the UE and
MB data, cf. [63]. As can be seen in Fig. 21, the UE tunes fail to
reproduce the ATLAS MB data at 7 TeV with a less tight cut on the number
of charged particles, $N_{\text{ch}} > 2$, and where all charged
particles with $p_\perp > 100$ MeV are taken into account. This is not
surprising, however, since Herwig lacks a model for soft diffractive
physics so far. That explains the poor description of both the charged
multiplicity and the average transverse momentum in the low-multiplicity
bins. On the other hand, the unsatisfactory description of the shown
observables in the high-multiplicity tail may indicate missing physics in
the model. It might, however, as well be resolved by a dedicated MB tune.
Both possibilities are left for future work. In particular, we point out
the lack of an explicit model for diffractive events. A more complete
description of the MB data should also include a modelling of these.

Fig. 19 Energy extrapolation of $p_\perp^{\min}$ in the UE-EE-3-CTEQ6L1
tune.



For the preparation of the energy-extrapolated tunes we did not use any
MB observables. Nevertheless, we show a comparison of the UE-EE-3-CTEQ6L1
and UE-EE-SCR-CTEQ6L1 tunes to the diffraction-reduced ATLAS MB data at
7 TeV (with $N_{\text{ch}} > 6$) in Fig. 20. We see that the data is
described slightly better by the SCR than by the PCR tune. Moreover,
although these data sets were not taken into account in either tune, the
results are close to the experimental data.



5 Conclusions 

We have introduced two different models for non-perturbative colour
reconnections in Herwig. The models are of slightly different
computational complexity but give very similar results. The tuning
results have shown that the SCR model prefers parameters that force a
quick 'cooling' of the system and therefore results in a model evolution
very similar to that of the simpler PCR model. We therefore consider the
PCR as a special case of the SCR model for quick cooling and keep the SCR
as the more flexible model for future versions of Herwig++. As a
consequence, we understand that the data demands a final state that does
not obey a perfectly minimized colour length. We interpret this as a
model limitation. At some point the picture of colour lines breaks down.
Colour lines themselves are only a valid prescription up to leading order
in the $N_c \to \infty$ limit. Furthermore, the mechanism addresses the
non-perturbative regime where the picture of the colour triplet charges
themselves is already a model by itself and possibly completely washed
out.

We have studied the mechanism of colour reconnection in detail and found
that in fact the non-perturbative parts of the simulation demand the
colour reconnection mechanism in order to repair the lack of information
on the colour flow. The intuitive picture we have based our model on
could be verified. The idea of colour preconfinement is meaningful in the
context of the hadronization model and has to be rectified when a model
of multiple partonic interactions is applied without further information
on the colour structure in between the multiple scatters.

Fig. 20 Comparison of the UE-EE-3-CTEQ6L1 and UE-EE-SCR-CTEQ6L1 tunes to
ATLAS minimum-bias distributions at $\sqrt{s} = 7$ TeV, with
$N_{\text{ch}} > 6$, $p_\perp > 500$ MeV and $|\eta| < 2.5$.

Fig. 21 Comparison of underlying-event tunes to presumably
diffraction-enhanced MB observables, measured by ATLAS at
$\sqrt{s} = 7$ TeV, with $N_{\text{ch}} > 2$, $p_\perp > 100$ MeV and
$|\eta| < 2.5$.

Furthermore, we have shown that by tuning the MPI model with CR we can
obtain a proper description of non-diffractive MB ATLAS observables. We
present the energy-extrapolated tune UE-EE-3, which is an important step
towards the understanding of the energy dependence of the model. Finally,
we have unified the different tunes of the MPI model in Herwig++ into a
simple parametrization of the $p_\perp^{\min}$ dependence in a way that
allows us to describe data at different energies with only one set of
parameters. News concerning Herwig tunes is available on the tune wiki
page [50].
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